Configuring direct access to hardware accelerator in an O-RAN system

ABSTRACT

Some embodiments provide various methods for offloading operations in an O-RAN (Open Radio Access Network) onto control plane (CP) or edge applications that execute on host computers with hardware accelerators in software defined datacenters (SDDCs). At the CP or edge application operating on a machine executing on a host computer with a hardware accelerator, the method of some embodiments receives data, from an O-RAN E2 unit, to perform an operation. The method uses a driver of the machine to communicate directly with the hardware accelerator to direct the hardware accelerator to perform a set of computations associated with the operation. This driver allows the communication with the hardware accelerator to bypass an intervening set of drivers executing on the host computer between the machine's driver and the hardware accelerator. Through this driver, the application in some embodiments receives the computation results, which it then provides to one or more O-RAN components (e.g., to the E2 unit that provided the data, another E2 unit or another control plane or edge application).

CLAIM OF BENEFIT TO PRIOR APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application 63/157,351, filed Mar. 5, 2021; U.S. Provisional Patent Application 63/157,600, filed Mar. 5, 2021; U.S. Provisional Patent Application 63/176,859, filed Apr. 19, 2021; and U.S. Provisional Patent Application 63/180,627, filed Apr. 27, 2021. U.S. Provisional Patent Applications 63/157,351, 63/157,600, 63/176,859, and 63/180,627 are incorporated herein by reference.

BACKGROUND

In telecommunications networks, the Radio Access Network (RAN) performs more and more functions with each iteration of the telecommunications standards. That is, in order to enable the advantages of 5G over previous standards, the 5G RAN performs various additional functions. These RAN functions are situated between user devices and the core network, and are thus often performed at the base stations (e.g., cell towers) where computing power can be limited.

BRIEF SUMMARY

Some embodiments provide various methods for offloading operations in an O-RAN (Open Radio Access Network) onto control plane (CP) or edge applications that execute on host computers with hardware accelerators in software-defined datacenters (SDDCs). For instance, at the CP or edge application operating on a machine executing on a host computer with a hardware accelerator, the method of some embodiments receives data, from an O-RAN E2 unit, for which an operation has to be performed. Examples of such a hardware accelerator include a graphical processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), and a structured ASIC. The machine and the E2 unit execute on host computers in one SDDC in some embodiments. In other embodiments, the machine and the E2 unit operate in two different physical locations. For example, the machine executes on the host computer in a first location, while the E2 unit operates at a second location close to a cell site of the O-RAN, the second location not having computers with hardware accelerators to perform complex operations including the received operation.

The method uses a driver of the machine to communicate directly with the hardware accelerator to direct the hardware accelerator to perform a set of computations associated with the operation. This driver allows the communication with the hardware accelerator to bypass an intervening set of drivers executing on the host computer between the machine's driver and the hardware accelerator. Through this driver, the application in some embodiments receives the computation results, which it then provides to one or more O-RAN components (e.g., to the E2 unit that provided the data, another E2 unit or another application).

In some embodiments, the machine is a virtual machine (VM) that executes on a hypervisor. In these embodiments, the intervening set of drivers includes at least one driver of at least one of an operating system or hypervisor executing on the host computer. In other embodiments, the machine is a Pod and the intervening set of drivers includes a driver of an operating system executing on the host computer. In some of these embodiments, the intervening set of drivers includes a hardware accelerator driver of a hypervisor executing on the host computer. Also, in some embodiments, the Pod operates within a VM and the intervening set of drivers includes a hardware accelerator driver of the VM. The method uses the hardware accelerator driver in some embodiments by using a hardware accelerator driver of the Pod or by using a hardware accelerator driver of the VM.

In some embodiments, the method uses the machine's driver by using a passthrough feature of a program on top of which the machine executes, in order to allow the machine's driver to communicate directly with the hardware accelerator driver without going through any set of intervening software processes of the program. This program in some embodiments is either a hypervisor or an operating system executing on the host computer.

In some embodiments, the method receives the data and provides the result through a RAN intelligent controller (RIC) SDK executing on the machine. The RIC SDK provides a set of connectivity APIs (e.g., a framework) through which applications (e.g., xApps or other control plane or edge applications) can communicate with the distributed near real-time (RT) RIC. Through the RIC SDK, the result is provided to the O-RAN E2 unit that provided the data in some embodiments, while in other embodiments it is provided to another application operating on another O-RAN E2 unit or on a machine executing on the same host computer as, or on a different host computer than, the application that uses the hardware accelerator to perform the operation.

In some of these embodiments, the method uses the machine's driver by having the application communicate with the machine's driver through the RIC SDK. In some embodiments, the application receives data for its operation and provides the result of its operation through a RIC executing on the host computer on a different machine. In yet other embodiments, the method receives the data and provides the result through a RIC executing on another host computer.

In some embodiments, the application uses high-level APIs of the RIC SDK to communicate with the hardware accelerator, and the RIC SDK converts the high-level APIs to low-level APIs that are needed to communicate with the machine's driver, which, in turn, relays the communication to the hardware accelerator. The low-level APIs are provided by a first company associated with the sale of the hardware accelerator, while the RIC SDK is provided by a second company associated with the distribution of the RIC SDK.

The method of other embodiments has the application's associated hardware accelerator driver communicate with virtualized hardware accelerators that are offered by an intervening virtualization application (e.g., hypervisor) that executes on the same host computer as the application. For instance, the method of some embodiments deploys a virtualization application on a host computer for sharing resources of the host computer among several machines executing on the host computer. This computer has a first set of one or more physical hardware accelerators.

The method deploys several applications on several machines to perform several O-RAN related operations for a set of O-RAN components. Through the virtualization application, the method defines a second set of two or more virtual hardware accelerators that are mapped to the first set of physical hardware accelerators by the virtualization application. The method assigns different virtual hardware accelerators to different applications. The method also configures the applications to use their assigned virtual hardware accelerators to perform their operations.
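As a non-limiting illustration, the mapping and assignment described above can be sketched as follows. The round-robin mapping policy and all names (define_virtual_accelerators, assign_accelerators, "vacc-0", "gpu-0", "xapp-a") are assumptions made for this sketch only; the description does not prescribe how the virtualization application maps virtual hardware accelerators to physical ones.

```python
from itertools import cycle


def define_virtual_accelerators(physical, count):
    """Map `count` virtual accelerators onto the physical ones.

    Round-robin placement is an assumption of this sketch, not a
    requirement of the described method.
    """
    if not physical:
        raise ValueError("at least one physical accelerator is required")
    source = cycle(physical)
    return {f"vacc-{i}": next(source) for i in range(count)}


def assign_accelerators(apps, virtual_map):
    """Give each application its own virtual accelerator."""
    if len(apps) > len(virtual_map):
        raise ValueError("not enough virtual accelerators for the applications")
    # zip pairs each application with one virtual accelerator name.
    return dict(zip(apps, virtual_map))


# One physical accelerator shared as two virtual accelerators.
virtual_map = define_virtual_accelerators(["gpu-0"], count=2)
assignment = assign_accelerators(["xapp-a", "xapp-b"], virtual_map)
print(virtual_map)
print(assignment)
```

In this sketch, two applications receive different virtual hardware accelerators even though both virtual accelerators are backed by the same physical device.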

In some embodiments, the deployed machines are Pods, and the applications are deployed to execute on the Pods. At least two Pods execute on one VM that executes above the virtualization application. This VM includes a hardware accelerator driver that is configured to communicate with two different virtual hardware accelerators for the two applications executing on the two Pods. In other embodiments, multiple Pods execute on one VM that executes above the virtualization application, and each Pod has a hardware accelerator driver that is configured to communicate with a virtual hardware accelerator that is assigned to that driver.

Some embodiments provide a novel method for deploying O-RAN applications. To install an application on a host computer, the method selects a set of installation files that includes a description for configuring passthrough access for the application to a hardware accelerator of the host computer. The method then uses the set of installation files to install the application on the host computer. This installation entails configuring, based on the description relating to the passthrough access, a program executing on the host computer to pass calls from a particular hardware accelerator driver associated with the application to the hardware accelerator without going through an intervening set of one or more drivers for the hardware accelerator that execute on the host computer between the particular hardware accelerator driver and the hardware accelerator. This configuration allows the application to bypass the intervening set of drivers when directing the hardware accelerator to perform operations for the application and to receive the results of the operations from the hardware accelerator.

In some embodiments, the application performs operations offloaded by an O-RAN E2 unit or is an edge application of the O-RAN. On the host computer, the application executes over the program, which in some embodiments is the host's operating system, while in other embodiments is a hypervisor executing on the host computer. In still other embodiments, the program is a virtual machine (VM) and the application operates on a Pod or container that executes on the VM.

Before performing the selecting and configuring, the deployment method of some embodiments identifies the host computer from several host computers as the computer on which the application should be installed. The method in some embodiments identifies the host computer by determining that the application requires a hardware accelerator, identifying a set of host computers that each comprise a hardware accelerator, and selecting the host computer from the set of host computers. The method selects the host computer by (1) determining that the application will need to communicate with a set of one or more other applications that execute on the selected host computer, and (2) selecting the host computer as the set of other applications already executes on the host computer. This installation of the application with the set of other applications on the selected host computer reduces communication delay between the application and the set of other applications.
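As a non-limiting illustration, the host identification described above can be sketched as a filter followed by a preference for co-location. The Host record, the select_host function, and the tie-breaking rule (the first host with the most co-located peer applications wins) are hypothetical and are introduced only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical inventory record for one host computer."""
    name: str
    has_accelerator: bool
    running_apps: set = field(default_factory=set)


def select_host(hosts, needs_accelerator, peer_apps):
    """Pick a host computer for a new application.

    Step 1: if the application requires a hardware accelerator, keep only
            the hosts that have one.
    Step 2: among those, prefer the host that already runs the most of the
            applications that the new application must communicate with.
    Returns None when no host qualifies.
    """
    candidates = [h for h in hosts if h.has_accelerator or not needs_accelerator]
    if not candidates:
        return None
    return max(candidates, key=lambda h: len(h.running_apps & peer_apps))


hosts = [
    Host("host-a", has_accelerator=False, running_apps={"xapp-1"}),
    Host("host-b", has_accelerator=True),
    Host("host-c", has_accelerator=True, running_apps={"xapp-1"}),
]
chosen = select_host(hosts, needs_accelerator=True, peer_apps={"xapp-1"})
print(chosen.name)  # host-c
```

Here host-a is excluded because it lacks a hardware accelerator, and host-c is preferred over host-b because the peer application already executes on it.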

The preceding Summary is intended to serve as a brief introduction to some embodiments of the invention. It is not meant to be an introduction or overview of all inventive subject matter disclosed in this document. The Detailed Description that follows and the Drawings that are referred to in the Detailed Description will further describe the embodiments described in the Summary as well as other embodiments. Accordingly, to understand all the embodiments described by this document, a full review of the Summary, Detailed Description and the Drawings is needed. Moreover, the claimed subject matters are not to be limited by the illustrative details in the Summary, Detailed Description and the Drawings, but rather are to be defined by the appended claims, because the claimed subject matters can be embodied in other specific forms without departing from the spirit of the subject matters.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purpose of explanation, several embodiments of the invention are set forth in the following figures.

FIG. 1 illustrates an example of O-RAN architecture according to some embodiments.

FIG. 2 illustrates an in-depth view of the components of both a non-real-time RIC and a near real-time RIC according to some embodiments.

FIG. 3 illustrates a more in-depth view of a MAC control assistor of some embodiments.

FIG. 4 illustrates a more in-depth view of a user-level tracer of some embodiments.

FIG. 5 illustrates another view of the O-RAN architecture of some embodiments, with a more in-depth view of the near real-time RIC.

FIG. 6 illustrates deployment of RIC SDKs on machines that execute control plane applications in some embodiments.

FIG. 7 illustrates that some embodiments deploy several RICs to execute on several host computers to implement a distributed near RT RIC that includes the RIC components illustrated in FIGS. 5 and 6.

FIG. 8 illustrates a RIC that executes on one host computer along with two machines on which two control plane applications execute.

FIG. 9 illustrates two RICs that execute on two host computers along with two machines on which two control plane applications and two RIC SDKs execute.

FIG. 10 illustrates a RIC that executes on a first host computer to connect two control plane applications that execute on two machines operating on two other host computers.

FIG. 11 illustrates a RIC that executes on a first host computer to connect two control plane applications that execute on two machines, one of which operates on the first host computer while the other operates on another host computer.

FIG. 12 illustrates examples of the different standard specified APIs that the distributed near RT RIC platform of some embodiments supports.

FIG. 13 illustrates embodiments in which the SDL cache is part of each RIC SDK that executes on the same machine as its control plane application.

FIG. 14 illustrates an example of control or edge applications that have passthrough access to a hardware accelerator of their host computer to perform some or all of their computations.

FIG. 15 illustrates a process that is performed in some embodiments in response to an O-RAN component directing a CP or edge application to perform an operation that requires the application to use a hardware accelerator of its host computer.

FIG. 16 illustrates an application performing an operation based on data from an E2 node.

FIG. 17 illustrates another example of control or edge applications that have passthrough access to a hardware accelerator of their host computer to perform some (or all) of their computations.

FIG. 18 illustrates yet another example of CP or edge applications that have passthrough access to a hardware accelerator of their host computer to perform some or all of their computations.

FIG. 19 illustrates a process that some embodiments use to deploy O-RAN applications with direct, passthrough access to the hardware accelerators of their host computers.

FIG. 20 illustrates an example of CP or edge applications that have passthrough access to a virtual hardware accelerator defined by a hypervisor executing on their host computer.

FIG. 21 conceptually illustrates an electronic system with which some embodiments of the invention are implemented.

DETAILED DESCRIPTION

In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.

Today, there is a push to have RAN implemented as O-RAN, a standard for allowing interoperability for RAN elements and interfaces. FIG. 1 illustrates an example of O-RAN architecture 100, according to some embodiments. The O-RAN architecture 100 includes a service management and orchestration framework (SMO) 110 with a non-real-time RIC 105, a near real-time RAN intelligent controller (RIC) 115, open control plane central unit (O-CU-CP) 120, open user plane central unit (O-CU-UP) 125, open distributed unit (O-DU) 130, open radio unit (O-RU) 135, and the O-Cloud 140. The O-CU-CP 120, the O-CU-UP 125, and the O-DU 130 may be collectively referred to as the managed functions 120-130 below.

As defined in the standard, the SMO 110 in some embodiments includes an integration fabric that allows the SMO to connect to and manage the RIC 115, the managed functions 120-130, and the O-Cloud 140 via the open interfaces 150. Unlike these elements, the O-RU 135 is not managed by the SMO 110, and is instead managed by the O-DU 130, as indicated by the dashed line 160, in some embodiments. In some embodiments, the O-RU 135 processes and sends radio frequencies to the O-DU 130.

In some embodiments, the managed functions 120-130 are logical nodes that each host a set of protocols. According to the O-RAN standard, for example, the O-CU-CP 120, in some embodiments, includes protocols such as radio resource control (RRC) and the control plane portion of packet data convergence protocol (PDCP), while the O-CU-UP 125 includes protocols such as service data adaptation protocol (SDAP), and the user plane portion of packet data convergence protocol (PDCP).

The two RICs are each adapted to specific control loop and latency requirements. The near real-time RIC (near-RT RIC) 115 provides programmatic control of open centralized units (O-CUs) and open distributed units (O-DUs) on time cycles of 10 ms to 1 second. The non-real-time RIC (non-RT RIC) 105, on the other hand, provides higher layer policies that can be implemented in the RAN either via the near-RT RIC or via a direct connection to RAN nodes. The non-RT RIC is used for control loops of more than 1 second. Each RIC 105 or 115 serves as a platform on which RAN control applications execute. These applications can be developed by third-party suppliers that are different from the RIC vendors. These applications are referred to as “xApps” (for the near-RT RIC 115) and “rApps” (for the non-RT RIC 105).
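As a non-limiting illustration, the timing split between the two controllers can be expressed as a simple classification of a control loop by its period. The function name and the handling of loops faster than 10 ms (returned as None, i.e., not assigned to either controller) are assumptions of this sketch.

```python
NEAR_RT_MIN_S = 0.010  # 10 ms, lower bound of the near real-time range
NEAR_RT_MAX_S = 1.0    # 1 second, upper bound of the near real-time range


def ric_for_control_loop(period_s):
    """Return which controller handles a control loop with the given period.

    Loops of more than 1 second map to the non-RT controller, loops of
    10 ms to 1 second map to the near-RT controller, and faster loops are
    not assigned to either controller in this sketch.
    """
    if period_s <= 0:
        raise ValueError("control loop period must be positive")
    if period_s > NEAR_RT_MAX_S:
        return "non-RT RIC"
    if period_s >= NEAR_RT_MIN_S:
        return "near-RT RIC"
    return None


for period in (0.001, 0.05, 1.0, 30.0):
    print(period, ric_for_control_loop(period))
```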

The near real-time RIC 115, in some embodiments, is a logical aggregation of several functions that use data collection and communications over the interfaces 155 in order to control the managed functions 120-130. In some embodiments, the non-real-time RIC 105 uses machine learning and model training in order to manage and optimize the managed functions 120-130. The near RT RIC in some of these embodiments also uses machine learning.

In some embodiments, the O-Cloud 140 is responsible for creating and hosting virtual network functions (VNFs) for use by the RIC 115 and the managed functions 120-130. In some embodiments, the DU is in charge of per-slot decisions of user scheduling and includes a RAN scheduler that performs MAC control assistance and user-level tracing. In order to take advantage of the greater computing power available in the cloud (i.e., compared to base stations that typically execute the RAN functions), the RIC is implemented in one or more public and/or private cloud datacenters and implements an improved cloudified RAN scheduler in the cloud, thereby offloading these MAC control assistance and user-level tracing functions from the DU to the RIC. The interfaces 155 in some embodiments enable the RAN to provide inputs to the functions at the RIC, and, at least in some embodiments, receive outputs that have been computed by these functions at the RIC.

FIG. 2 illustrates an in-depth view of the components of both a non-real-time RIC 201 and a near real-time RIC 202. Each of the RICs 201 and 202 includes a respective set of analytics functions 210 and 212, and a respective set of optimization functions 214 and 216, which are each illustrated with dashed lines to indicate they are existing components. In addition to these existing components, the near real-time optimization functions 216 include two new components, the MAC control assistor 220 and user-level tracer 222, illustrated with solid lines to visually differentiate them from the existing components. In some embodiments, these components are part of a larger MIMO component (e.g., along with the MU-MIMO UE pairer and precoder).

In some embodiments, the MAC control assistor 220 can include various functions such as (1) User Equipment (UE)-specific beamforming weight calculation based on UL SRS channel signal reception, (2) UE Radio Frequency (RF) condition prediction, and (3) Multi-User, Multiple Input, Multiple Output (MU-MIMO) pairing suggestion for the MAC scheduler based on the UE-specific beams. For each of these functions, some embodiments expose a report interface (that provides input data for the function to the RIC from the DU) and a control interface (that provides output data for the function to the DU from the RIC).

The user-level tracer 222, in some embodiments, produces L1/L2/L3 level information related to user configuration and traffic performance. This tracing data can be used as inputs to various control algorithms, including the MAC scheduler, parameter setting, etc. The user-level tracer 222 can include tracing operations that can (i) track user behavior in a cell, (ii) track user RF condition, (iii) track user data traffic performance in different layers (MAC, Radio Link Control (RLC), Packet Data Convergence Protocol (PDCP)), and (iv) track user RF resource consumption.

FIG. 3 illustrates a more in-depth view of a MAC control assistor 300 of some embodiments. As illustrated, the MAC control assistor 300 includes a UE-specific beamforming weight calculator (BFWC) 310, a UE RF condition predictor 320, and a MU-MIMO pairing suggestor 330. The UE-specific BFWC 310 in some embodiments is based on UL SRS channel signal reception. In some embodiments, the MU-MIMO pairing suggestor 330 is for the MAC scheduler based on the UE-specific beams.

Each of the components 310-330 of the MAC control assistor 300 includes an uplink and a downlink, as shown. For the UE-specific BFWC function, some embodiments expose a report interface for an uplink Sounding Reference Signal (UL SRS) channel response matrix that is an input to the weight calculation function and a control interface for a UE-specific beamforming weight matrix. For the UE RF condition predictor function, some embodiments expose a report interface for a downlink (DL) channel condition report that is an input to the RF condition prediction and a control interface for a predicted DL channel condition (e.g., including DL SINR, PMI, and rank) for the next scheduling window. For the MU-MIMO pairing suggestion function, some embodiments expose a report interface for a UE-specific beamforming weight matrix that is an input to the pairing suggestion function and a control interface for UE pairing suggestion and SINR impact assessment.

FIG. 4 illustrates a more in-depth view of a user-level tracer 400 of some embodiments. The tracer 400 includes multiple uplinks 410 and multiple downlinks 415 for performing tracing operations, in some embodiments. These operations produce L1/L2/L3 level information related to user configuration and traffic performance. This tracing data can be used as inputs to various control algorithms, including the MAC scheduler, parameter setting, etc. These tracing operations can (1) track user behavior in a cell, (2) track user RF condition, (3) track user data traffic performance in different layers (MAC, RLC, PDCP), and (4) track user RF resource consumption.

For these tracing operations, some embodiments expose report interfaces for the DU and/or the CU to provide various metrics to the user level tracing operations. These metrics can include selected RRC messages, MAC/RLC/PDCP traffic volume and performance, RF condition, and RF resource consumption. In some embodiments, messages over these interfaces to the RIC are triggered based on user behavior and/or periodic reporting (e.g., for traffic performance and RF condition/resource consumption).

The tracing operations track the various user data indicated above, and can provide this information either back to the RAN or to other control algorithms (e.g., other algorithms operating at the RIC). For instance, these algorithms might perform analysis on the user data performance from the user level tracing operations, determine that certain performance is inadequate, and modify how the RAN is treating the user traffic. Examples of control algorithms that can benefit from user-level tracing in some embodiments include (1) traffic steering, (2) quality of service (QoS) scheduling optimization, (3) user configuration adjustment, and (4) user behavior anomaly detection.

For all of the operations described in FIGS. 3-4 (i.e., the MAC scheduler functions and the user-level tracing operations), the increased computing power available to the RIC in the cloud enables more complex computations without excessive latency. For instance, some or all of these operations can be performed at the RIC using machine learning (e.g., using machine-trained networks, etc.).

FIG. 5 illustrates another view of the O-RAN architecture of some embodiments, with a more in-depth view of the near real-time RIC. The architecture 500 includes an SMO 505 with a non-real-time RIC 510, a distributed near real-time RIC 515, and E2 nodes 520 (e.g., O-DU and/or O-CU nodes). The distributed near real-time RIC 515 includes messaging infrastructure 540, a set of services (e.g., 550, 552, 554, and 556), a shared data layer 560, a database 570, and a set of termination interfaces (e.g., 580, 582, and 584). As shown, a set of embedded apps (e.g., 530, 532, and 534) uses this distributed near RT RIC. As further described below, the distributed near RT RIC 515 is implemented by multiple RICs executing on multiple host computers in some embodiments.

As shown, the set of services include conflict mitigation services 550, app subscription management services 552, management services 554, and security services 556. Additionally, the set of termination interfaces include O1 termination interface 580 connecting the SMO to the near real-time RIC, A1 termination interface 582 connecting the non-real-time RIC to the near real-time RIC, and E2 termination interface 584 connecting the E2 nodes to the near real-time RIC. Each of the apps, in some embodiments, is representative of the various functions of the RIC that use data sent from the E2 nodes 520. For example, app 530 may correspond to the UE-specific BFWC 310 of the MAC control assistor 300, app 532 may correspond to the UE RF condition predictor 320 of the MAC control assistor 300, etc.

In some embodiments, the objective of the framework 500 is to offload near real-time functions that are computation-intensive, and provide results back to the O-DU (e.g., via the E2 interface with E2 nodes 520). The results, in some embodiments, can be used to assist or enhance the real-time decision in the MAC layer. Three example use-cases for the MAC control assistance framework, each example specific to a different component of the MAC control assistor (e.g., the UE-specific BFWC, the UE RF condition predictor, and the MU-MIMO pairing suggestor), and one use-case example for the user-level tracer, will be described below.

The first example use-case is specific to the UE-specific beamforming weight calculation based on UL SRS signal reception component of the MAC control assistance framework (e.g., component 310 of the MAC control assistor 300). In some embodiments of this use-case, the input metrics can include multiple options based on UL SRS, such as raw SRS received data, and an SRS channel responses matrix from a channel estimate.

The algorithm for producing output metrics, in some embodiments, evaluates the optimal beam-forming weights to reach the user. Some embodiments use traditional signal processing algorithms that are based on channel models. Alternatively, or conjunctively, machine-learning based algorithms that utilize raw data inputs are used, which require feedback from the DU in the E2 nodes 520.

In some embodiments, the output metrics resulting from the algorithm include a beam-form weight (BFW) matrix for the user. In some embodiments, the BFW could also be mapped to a beam index from a pre-designed beam set. The DU in some embodiments uses the matrix to control the MIMO antenna array gain/phasing in the RU (e.g., the O-RU 135 in the architecture 100) for user data transmission and reception.
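As a non-limiting illustration, mapping a beamforming weight vector to a beam index from a pre-designed beam set can be sketched as a nearest-match search. The matching criterion used here (largest magnitude of the inner product between the weight vector and each candidate beam), the two-antenna beam set, and the function name beam_index are assumptions of this sketch; the description does not specify how the mapping is performed.

```python
def beam_index(weights, beam_set):
    """Return the index of the beam in `beam_set` best aligned with `weights`.

    Alignment is measured by the magnitude of the complex inner product,
    which is an assumed criterion for this sketch. All beams in the set are
    expected to have the same norm so that the comparison is fair.
    """
    def correlation(beam):
        return abs(sum(b.conjugate() * w for b, w in zip(beam, weights)))

    return max(range(len(beam_set)), key=lambda i: correlation(beam_set[i]))


# Hypothetical two-antenna beam set with phase steps of 90 degrees.
beam_set = [(1, 1), (1, 1j), (1, -1), (1, -1j)]

# A computed weight vector whose second element leads by roughly 90 degrees.
weights = (0.7, 0.1 + 0.7j)
print(beam_index(weights, beam_set))  # 1
```

Reporting an index instead of the full matrix reduces the amount of data returned to the DU, at the cost of quantizing the weights to the nearest pre-designed beam.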

The second use-case example is specific to the UE RF condition predictor component of the MAC control assistance framework (e.g., component 320 of the MAC control assistor 300). For this second use-case, the input metrics include at least a channel report from the UE, such as Wideband or Subband CQI/PMI/RI for DL, or SRS for UL, according to some embodiments. The input metrics of some embodiments can also opt to include supportive information such as UE distance, UE positioning, etc.

In some embodiments, the app algorithm for this second use-case is meant to predict the UE's RF condition based on the observation. Some embodiments utilize traditional signal processing algorithms based on channel and mobility models. Alternatively, or conjunctively, some embodiments also use machine learning based algorithms using data inputs and potentially other factors, such as site layout (which requires feedback from the DU).

The output metrics for this use-case, in some embodiments, include the predicted channel condition of the user for the next scheduling window, as well as predicted downlink and uplink SINR, a precoding matrix (e.g., if applicable), and SU-MIMO layers. In some embodiments, these output metrics are used by the DU for the user link adaptation on PDCCH/PDSCH/PUSCH transmissions.

The third use-case example is specific to the MU-MIMO pairing suggestor to MAC scheduler component (e.g., component 330 of the MAC control assistor 300). The input metrics for this example use case, in some embodiments, include at least the UE-specific BFW matrix and the UE RF condition estimate. Some embodiments may also include supportive metrics such as user data demand, etc., as input metrics in addition to the UE-specific BFW matrix and the UE RF condition estimate.

The app algorithm for this use-case, in some embodiments, is meant to identify users that can be paired for MU-MIMO operations. For example, some embodiments of the third use-case use traditional signal processing algorithms based on information theory and cross-channel covariance evaluation. Alternatively, or conjunctively, some embodiments use machine learning based algorithms using the data inputs, which again requires feedback from the DU.
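As a non-limiting illustration of a signal-processing style pairing check, the sketch below suggests pairs of users whose beamforming weight vectors are nearly orthogonal. The normalized-correlation metric, the 0.3 threshold, and the names normalized_correlation and suggest_pairs are assumptions of this sketch and are not taken from the description.

```python
import math
from itertools import combinations


def normalized_correlation(u, v):
    """Magnitude of the inner product of u and v, scaled to the range 0..1."""
    inner = abs(sum(a.conjugate() * b for a, b in zip(u, v)))
    norm = math.sqrt(sum(abs(a) ** 2 for a in u) * sum(abs(b) ** 2 for b in v))
    return inner / norm


def suggest_pairs(bfw_by_ue, max_correlation=0.3):
    """Suggest UE pairs whose weight vectors interfere little with each other.

    The threshold is a hypothetical tuning parameter.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(bfw_by_ue), 2)
        if normalized_correlation(bfw_by_ue[a], bfw_by_ue[b]) <= max_correlation
    ]


# Hypothetical per-UE weight vectors for a two-antenna array.
bfw_by_ue = {"ue-1": (1, 1), "ue-2": (1, -1), "ue-3": (1, 0.9)}
print(suggest_pairs(bfw_by_ue))
```

In this example, ue-1 and ue-3 point in nearly the same direction and are therefore not suggested as a pair, while ue-2 is nearly orthogonal to both.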

In some embodiments, the output metrics of this third use-case can include UE pairing suggestions and an impact assessment on SINR and SU-MIMO layers. Additionally, the DU in some embodiments uses the output metrics to select users for RF scheduling, and to determine the transmission efficiencies.

An example use-case for the user-level tracer can include QoS scheduling optimization with the goal of adjusting a user's scheduling priority for an RF resource to optimize the service quality. The input for some embodiments of this use-case can include a service quality target from a user subscription. In some embodiments, the user-level tracing includes (1) tracking the user RF condition, (2) tracking the user data traffic performance in different layers (e.g., MAC/RLC/PDCP), and (3) tracking the user RF resource consumption.

In some embodiments, the app algorithm is based on the QoS target and observed user traffic performance, and can be used to determine that a user's resource allocation is insufficient. The algorithm format, in some embodiments, can be logic-based or machine learning-based. In some embodiments, the output can include a recommendation issued to the MAC scheduler to adjust the traffic priority or link adaptation in order to improve performance.
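As a non-limiting illustration of the logic-based format, the sketch below compares observed throughput with the QoS target and, when the allocation is insufficient, recommends one of the two adjustments named above. The decision rule, the throughput units, and the names qos_recommendation, "raise_priority", and "adjust_link_adaptation" are hypothetical.

```python
def qos_recommendation(target_mbps, observed_mbps, rf_condition_good):
    """Return a recommendation for the MAC scheduler, or None if none is needed.

    Assumed rule for this sketch: if the user misses the QoS target while the
    RF condition is good, the shortfall is attributed to scheduling and a
    higher traffic priority is recommended; otherwise a link adaptation
    change is recommended.
    """
    if observed_mbps >= target_mbps:
        return None  # the QoS target is met, so no recommendation is issued
    if rf_condition_good:
        return "raise_priority"
    return "adjust_link_adaptation"


print(qos_recommendation(target_mbps=20.0, observed_mbps=25.0, rf_condition_good=True))
print(qos_recommendation(target_mbps=20.0, observed_mbps=12.0, rf_condition_good=True))
print(qos_recommendation(target_mbps=20.0, observed_mbps=12.0, rf_condition_good=False))
```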

On each machine (e.g., each VM or Pod) that executes a control plane application, some embodiments configure a RIC SDK to serve as an interface between the control plane application on the machine and a set of one or more elements of the RAN. In some embodiments, the RIC SDK provides a set of connectivity APIs (e.g., a framework) through which applications can communicate with the distributed near real-time (RT) RIC implemented by two or more near real-time RICs. Examples of such applications include xApps, and other control plane and edge applications in some embodiments. In O-RAN, xApps perform control plane, monitoring and data processing operations. The discussion below regarding FIGS. 6 and 8-20 refers to control plane applications (e.g., 615, 815, 820, 915, 920, etc.). These control plane applications are xApps in an O-RAN system in some embodiments.

FIG. 6 illustrates deployment of RIC SDKs 605 on machines 610 that execute control plane applications 615 in some embodiments. As shown, one or more machines 610 execute on each of several host computers 607 in one or more datacenters. In some embodiments, the RIC SDK 605 on each machine 610 includes a set of network connectivity processes that establish network connections to the set of RAN elements (e.g., E2 nodes 520, shared data layer 560, management services 554, SMO 505, etc.) for the control plane application. The RIC SDK processes allow the control plane application on their machine to forgo performing network connectivity operations. In some embodiments, the set of network connectivity processes of each RIC SDK of each machine establishes and maintains network connections between the machine and the set of RAN elements used by the control plane application of the machine, and handles data packet transport to and from the set of RAN elements for the control plane application.

The control plane application on each machine communicates with the set of RAN elements through high-level APIs 620 that the RIC SDK converts into low-level APIs 625. In some embodiments, at least a subset of the low-level API calls 625 are specified by a standard specifying body. Also, in some embodiments, the high-level API calls 620 are made in a high-level programming language (e.g., C++), while the low-level API calls comprise low-level calls that establish and maintain network connections and pass data packets through these connections.

The set of RAN elements that the RIC SDK connects with the control plane application on its machine in some embodiments includes RAN elements that are produced and/or developed by different RAN vendors and/or developers. These RAN elements include CUs 630 and DUs 635 of the RAN in some embodiments. Also, this SDK communicates with the CUs and DUs through the low-level, standard-specified E2 interface, while the control plane application on the machine uses high-level API calls to communicate with the CUs and DUs through the RIC SDK. In some embodiments, the high-level API calls specify E2 interface operations at a high-level application layer and do not include low-level transport or network operations.
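The division of labor between the high-level and low-level calls can be sketched as follows. This is a hypothetical facade, not the SDK of any embodiment: the class and method names are invented, the transport merely records calls, and JSON stands in for the standard-specified E2 message encoding.

```python
import json


class RecordingTransport:
    """Stand-in for the low-level connection layer; it only records calls."""

    def __init__(self):
        self.calls = []

    def connect(self, node_id):
        self.calls.append(("connect", node_id))

    def send(self, node_id, payload):
        self.calls.append(("send", node_id, payload))


class HighLevelSdk:
    """Hypothetical SDK facade that hides connection handling from the application."""

    def __init__(self, transport):
        self._transport = transport
        self._connected = set()

    def subscribe(self, node_id, metric):
        # Establish the connection once per node, on the application's behalf.
        if node_id not in self._connected:
            self._transport.connect(node_id)
            self._connected.add(node_id)
        # JSON is a placeholder encoding, not the standard-specified E2 format.
        message = {"type": "subscription_request", "metric": metric}
        self._transport.send(node_id, json.dumps(message, sort_keys=True).encode())


transport = RecordingTransport()
sdk = HighLevelSdk(transport)
sdk.subscribe("du-635", "rf_condition")
sdk.subscribe("du-635", "throughput")
```

The application issues two high-level calls and never touches the connection; the facade turns them into one connection setup followed by two low-level sends.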

Conjunctively, or alternatively, the set of RAN elements that the RIC SDK connects with the control plane application 615 on its machine 610 includes network elements of the RIC. Again, these network elements in some embodiments include RAN elements that are produced and/or developed by different RAN vendors and/or developers. These RIC elements in some embodiments include shared data layer (SDL) 560, datapath input/output (I/O) elements, and application and management services 552 and 554. FIG. 7 illustrates that some embodiments deploy several near RT RICs 705 to execute on several host computers to implement a distributed near RT RIC 700 that includes the RIC components illustrated in FIGS. 5 and 6. In some embodiments, one RIC 705 executes on each host computer that also executes a control plane application 615. In other embodiments, a control plane application 615 can execute on a host computer that does not execute a RIC. For instance, in some embodiments, one or more control plane applications execute on one or more host computers that have graphics processing units (GPUs), while RICs do not execute on such host computers as they do not need the processing power of the GPUs.

Through the distributed near RT RIC, the RIC SDK also connects its control plane application to other control plane applications executing on other machines. In other words, the RIC SDK and the distributed near RT RIC in some embodiments serve as a communication interface between the control plane applications. In some embodiments, the different control plane applications are developed by different application developers that use the common set of RIC APIs to communicate with each other through the distributed near RT RIC. In some of these embodiments, the distributed near RT RIC adds one or more parameters to the API calls as it forwards the API calls from one control plane application to the other control plane application.

FIGS. 8-11 illustrate several examples of RIC architectures in which the RIC SDK and the distributed near RT RIC establish the communication interface between control plane applications. These architectures are mutually exclusive in some embodiments, while in other embodiments two or more of these architectures are used conjunctively. FIG. 8 illustrates a RIC 800 that executes on one host computer 805 along with two machines 810 and 812 on which two control plane applications 815 and 820 execute. Through the RIC SDKs 802 and 804 executing on the machines 810 and 812, the RIC 800 receives API calls from the first CP application 815 and forwards the API calls to the second CP application 820, and passes responses to these API calls from the second CP application 820 to the first CP application 815. It also passes API calls from the second CP application 820 to the first CP application 815, and responses from the first CP application 815 to the second CP application 820.

FIG. 9 illustrates two RICs 900 and 901 that execute on two host computers 905 and 907 along with two machines 910 and 912 on which two control plane applications 915 and 920 and two RIC SDKs 902 and 904 execute. As shown, API calls from the first CP application 915 to the second CP application 920 are forwarded through the first RIC SDK 902, the first RIC 900, the second RIC 901 and the second RIC SDK 904. The second CP application's responses to these API calls traverse the reverse path to the first CP application 915, through the second RIC SDK 904, the second RIC 901, the first RIC 900, and the first RIC SDK 902.

The API calls from the second CP application 920 to the first CP application 915 are forwarded through the second RIC SDK 904, the second RIC 901, the first RIC 900, and the first RIC SDK 902, while responses to these API calls from the first CP application 915 to the second CP application 920 are forwarded through the first RIC SDK 902, the first RIC 900, the second RIC 901 and the second RIC SDK 904.

FIG. 10 illustrates a RIC 1000 that executes on a first host computer 1005 to connect two control plane applications 1015 and 1020 that execute on two machines 1010 and 1012 operating on two other host computers 1006 and 1007. Through the RIC SDKs 1002 and 1004 executing on the machines 1010 and 1012, the RIC 1000 receives API calls from the first CP application 1015 and forwards the API calls to the second CP application 1020, and passes responses to these API calls from the second CP application 1020 to the first CP application 1015. It also passes API calls from the second CP application 1020 to the first CP application 1015, and responses from the first CP application 1015 to the second CP application 1020.

FIG. 11 illustrates a RIC 1100 that executes on a first host computer 1105 to connect two control plane applications 1115 and 1120 that execute on two machines 1110 and 1112, one of which operates on host computer 1105 while the other operates on host computer 1106. Through the RIC SDKs 1102 and 1104 executing on the machines 1110 and 1112, the RIC 1100 receives API calls from the first CP application 1115 and forwards the API calls to the second CP application 1120, and passes responses to these API calls from the second CP application 1120 to the first CP application 1115. Through these SDKs 1102 and 1104, the RIC 1100 also passes API calls from the second CP application 1120 to the first CP application 1115, and responses from the first CP application 1115 to the second CP application 1120.

FIG. 12 illustrates examples of the different standard specified APIs that the distributed near RT RIC platform of some embodiments supports. As shown, the distributed near RT RIC platform 1200 in some embodiments uses the E2, O1, and A1 interfaces specified by the O-RAN standard specifying body. It uses the E2 APIs to communicate with the E2 O-RAN nodes, such as the O-CU-CPs 1202, O-CU-UPs 1204, and O-DUs 1206. It also uses the A1 APIs to communicate with the non-real-time RIC platform 1208, and uses the O1 APIs to communicate with the SMO 1210.

For each of these E2, A1, and O1 APIs, the RIC SDKs 1215 provide high-level counterpart APIs for the control plane applications 1220 that use the RIC SDKs and the distributed near RT RIC platform to communicate with the E2 nodes 1202-1206, the non-real-time RIC platform 1208 and the SMO 1210. FIG. 12 designates these high-level counterpart APIs for the E2, O1, and A1 interfaces with a prime sign as the E2′ API calls, O1′ API calls and A1′ API calls. These high-level counterpart APIs are not specified by a standard body, but are APIs that the RIC SDK and/or distributed near RT RIC convert into standard specified API calls.

FIG. 12 also shows several internal-RIC APIs for allowing the control plane applications 1220 to communicate with each other through the RIC SDKs and the distributed near RT RIC, and to communicate with one or more elements of the distributed near RT RIC (e.g., shared data layer (SDL) 560, datapath input/output (I/O) elements, and application and management services 552 and 554).

Enablement APIs are the APIs that are used in some embodiments to allow the control plane applications 1220 to communicate with each other. As described above by reference to FIGS. 8-11, these APIs are passed through the distributed near RT RIC in some embodiments. In other embodiments, these APIs allow the RIC SDKs of the control plane applications to directly communicate with each other without traversing through any other components of the distributed near RT RIC. For this reason, FIG. 12 includes a dashed bi-directional arrow between the RIC SDKs 1215 of the two control plane applications 1220 to indicate that in some embodiments the RIC SDKs 1215 of these applications communicate directly with each other.

The enablement APIs in some embodiments include registration APIs, service discovery APIs as well as inter-app communication APIs. Registration APIs are used by the applications 1220 (e.g., xApps) to introduce themselves to other applications 1220 by providing their network identifiers (e.g., their network address and available L4 ports) and providing their functionality (e.g., performing channel prediction). Service discovery APIs allow control plane applications 1220 (e.g., xApps) to query the service directory (e.g., of the distributed near RT RIC) for other control plane applications (e.g., other xApps) that provide a particular service. The inter-app communication APIs allow the control plane applications to communicate with each other to pass along data and/or request certain operations.
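The registration and service discovery operations described above can be sketched with a small in-memory directory. The class, its methods, and the sample addresses are hypothetical; an actual service directory would be maintained by the distributed near RT RIC rather than by a local object.

```python
class ServiceDirectory:
    """Illustrative in-memory directory of registered control plane applications."""

    def __init__(self):
        self._apps = {}

    def register(self, app_name, address, port, services):
        """Registration: record an application's network identifiers and functionality."""
        self._apps[app_name] = {
            "address": address,
            "port": port,
            "services": set(services),
        }

    def discover(self, service):
        """Service discovery: names of applications that provide the given service."""
        return sorted(
            name for name, info in self._apps.items() if service in info["services"]
        )

    def endpoint(self, app_name):
        """Address and L4 port to use for inter-app communication."""
        info = self._apps[app_name]
        return info["address"], info["port"]


directory = ServiceDirectory()
directory.register("xapp-a", "10.0.0.5", 7000, ["channel_prediction"])
directory.register("xapp-b", "10.0.0.6", 7001, ["user_pairing"])
providers = directory.discover("channel_prediction")
```

An application that needs channel prediction would query the directory, obtain `xapp-a`, and then use the returned endpoint for its inter-app calls.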

Some embodiments deploy an SDL cache on the same host computer as a control plane application, and use this cache to process at least a subset of the SDL storage access requests of the control plane application. In some embodiments, the control plane application and the SDL cache operate on a machine that executes on the host computer. In other embodiments, the SDL cache operates on the same host computer but outside of the machine on which the control plane application executes. In some of these embodiments, multiple control plane applications executing on the same host computer use a common SDL cache on that host computer.

The SDL cache is part of a RIC that executes on the same host computer as the control plane application in some embodiments. In other embodiments, the SDL cache is part of the RIC SDK that executes on the same machine as the control plane application. In either of these embodiments, a synchronizing process of the RIC or the RIC SDK synchronizes the data stored in the SDL cache with the data stored in the SDL storage.

In some embodiments, the SDL storage operates on a different host computer than the host computer on which the control plane application executes, while in other embodiments at least a portion of the SDL storage operates on the same host computer on which the control plane application executes. Also, in some embodiments, the RIC or the RIC SDK forwards SDL access requests from the control plane application to the SDL storage when the RIC SDK cannot process the SDL access requests through the SDL cache. For instance, the RIC or the RIC SDK cannot process SDL access requests through the SDL cache when the SDL cache does not store data requested by the control plane application.

FIG. 13 illustrates embodiments in which the SDL cache 1302 is part of each RIC SDK 1300 that executes on the same machine 1305 as its control plane application 1310. As shown, the RIC SDK 1300 includes a query manager 1325 that processes SDL requests from the CP application 1310 and a synchronizing service 1327 that synchronizes the data stored in the SDL cache with the data stored in an SDL storage 1350 of the SDL 1355 of the distributed near RT RIC 1330. In this example, the SDL storage 1350 operates on a different host computer than the host computer on which the control plane application 1310 executes. However, in other embodiments, at least a portion of the SDL storage 1350 operates on the same host computer on which the control plane application 1310 executes.

When the control plane application 1310 uses a high-level API call to read or write data to the SDL storage, the query manager 1325 of the RIC SDK 1300 first determines whether the data record being read or written is stored in the SDL cache 1302. If so, the query manager 1325 reads from or writes to this record. When this operation is a write operation, the synchronizing service 1327 writes the new data in real-time or on a batch basis to the SDL storage 1350. On the other hand, when the query manager 1325 of the RIC SDK 1300 determines that the data record being read or written is not stored in the SDL cache 1302, it passes the API call to the SDL layer of the distributed near RT RIC to perform the requested read or write operation. When passing this API call, the RIC SDK 1300 modifies the format of this call and/or modifies the parameters supplied with this call in some embodiments.
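The read and write handling just described can be sketched as follows. This is a simplified illustration, assuming a dictionary as a stand-in for the SDL storage and batch synchronization of cached writes; the class and method names are hypothetical.

```python
class CachedSdlAccess:
    """Illustrative query manager plus synchronizing service over an SDL cache."""

    def __init__(self, storage, cached_keys):
        self._storage = storage  # dict standing in for the remote SDL storage
        self._cache = {key: storage[key] for key in cached_keys}
        self._dirty = set()      # cached records written since the last sync

    def read(self, key):
        """Serve the read from the cache when the record is cached."""
        if key in self._cache:
            return self._cache[key], "cache"
        return self._storage[key], "storage"

    def write(self, key, value):
        """Write to the cached record, or pass the write on to the storage."""
        if key in self._cache:
            self._cache[key] = value
            self._dirty.add(key)
            return "cache"
        self._storage[key] = value
        return "storage"

    def synchronize(self):
        """Batch-write modified cached records to the storage; return the count."""
        for key in sorted(self._dirty):
            self._storage[key] = self._cache[key]
        flushed = len(self._dirty)
        self._dirty.clear()
        return flushed


sdl_storage = {"ue1/rf": 0.7, "ue2/rf": 0.4}
access = CachedSdlAccess(sdl_storage, cached_keys=["ue1/rf"])
access.write("ue1/rf", 0.9)   # handled in the cache
access.write("ue2/rf", 0.5)   # not cached, passed to the storage
stale_value = sdl_storage["ue1/rf"]
flushed = access.synchronize()
```

Until `synchronize` runs, the storage still holds the old value for the cached record, which is the trade-off of batch synchronization compared with real-time write-through.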

Some embodiments provide various methods for offloading operations in an O-RAN (Open Radio Access Network) onto control plane (CP) or edge applications that execute on host computers with hardware accelerators in software defined datacenters (SDDCs). For instance, at the CP or edge application operating on a machine executing on a host computer with a hardware accelerator, the method of some embodiments receives data, from an O-RAN E2 unit, for which it has to perform an operation. The method uses a driver of the machine to communicate directly with the hardware accelerator to direct the hardware accelerator to perform a set of computations associated with the operation. This driver allows the communication with the hardware accelerator to bypass an intervening set of drivers executing on the host computer between the machine's driver and the hardware accelerator. Through this driver, the application in some embodiments receives the computation results, which it then provides to one or more O-RAN components (e.g., to the E2 unit that provided the data, to another E2 unit or to another xApp).

FIGS. 14-20 illustrate several different embodiments for offloading O-RAN operations to CP or edge applications that have passthrough access to the hardware accelerators of their host computers. Examples of such a hardware accelerator include a graphics processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), and a structured ASIC.

FIG. 14 illustrates an example of CP or edge applications 1402 that have passthrough access to a hardware accelerator 1450 of their host computer 1410 to perform some or all of their computations. As shown, each application 1402 executes on a Pod 1404, which has an accelerator driver 1412 with direct, passthrough access to the accelerator 1450 of its host computer 1410. Each Pod 1404 operates within (i.e., executes on) a VM 1406, which, in turn, executes over a hypervisor 1408 of the host computer.

In some embodiments, a Pod is a small deployable unit of computing that can be created and managed in Kubernetes. A Pod includes a group of one or more containers with shared storage and network resources, and a specification for how to run the containers. In some embodiments, a Pod's contents are always co-located and co-scheduled, and run in a shared context. A Pod models an application-specific logical host computer; it contains one or more application containers that communicate with each other. In some embodiments, the shared context of a Pod is a set of operating system namespaces (e.g., Linux cgroups). Within a Pod's context, the individual applications may have further sub-isolations applied.

Each Pod's accelerator driver 1412 has direct access to the hardware accelerator 1450, and this access bypasses the hardware accelerator drivers 1414 and 1416 of the VM 1406 and the hypervisor 1408. In some embodiments, the hypervisor 1408 executes over an operating system (not shown) of the host computer 1410. In these embodiments, the direct access of each Pod's accelerator driver 1412 to the hardware accelerator 1450 also bypasses the hardware accelerator driver of the operating system.

To communicate with the hardware accelerator, each application 1402 in some embodiments communicates through the RIC SDK 1430 executing on its Pod. For instance, in some embodiments, each application 1402 uses high-level APIs of the RIC SDK 1430 to communicate with the hardware accelerator 1450. The RIC SDK 1430 then converts the high-level APIs to low-level APIs that are needed to communicate with the machine's driver 1412, which, in turn, relays the communication to the hardware accelerator 1450. In some embodiments, the low-level APIs are provided by a first company associated with the sale of the hardware accelerator 1450, while the RIC SDK 1430 is provided by a second company associated with the distribution of the RIC SDK 1430. In some embodiments, the low-level APIs used by the RIC SDK 1430 are APIs specified in an API library 1432 associated with the hardware accelerator 1450.

FIG. 15 illustrates a process 1500 that implements the method of some embodiments. The process 1500 is performed in response to an O-RAN component directing a CP or edge application to perform an operation that requires the application to use a hardware accelerator of its host computer. This process 1500 will be described below by reference to FIG. 16, which illustrates the application 1402 performing an operation based on data received from an E2 node 1650.

As shown in FIG. 15, the process 1500 starts when the application 1402 (at 1505) receives data from an O-RAN E2 unit 1650 executing on the host computer 1610. In some embodiments, the application 1402 subscribes for data from the E2 unit 1650, and the data received at 1505 is in response to this subscription. This subscription is made through the distributed near RT RIC in some embodiments. The host computers 1410 and 1610 of the application 1402 and the E2 unit 1650 operate in one SDDC in some embodiments. In other embodiments, these two host computers 1410 and 1610 operate in two different physical locations. For example, the host computer 1410 operates in a first location, while the host computer 1610 operates at a second location close to a cell site of the O-RAN. In some embodiments, the second location does not have computers with hardware accelerators that perform complex operations including the received operation.

The application 1402 receives (at 1505) the data from the E2 unit 1650 through (1) the distributed near RT RIC 1680 formed by near RT RICs 1640 and 1645 executing on host computers 1410 and 1610, and (2) the RIC SDK 1430 executing on its Pod 1404. The application 1402 then uses (at 1510) the hardware accelerator 1450 to perform a set of computations associated with the operation.

To communicate with the hardware accelerator 1450, the application 1402 uses high-level APIs provided by the RIC SDK 1430. The RIC SDK 1430 then converts the high-level APIs to low-level APIs specified in the API library 1432 associated with the hardware accelerator 1450. These low-level APIs are then communicated to the hardware accelerator 1450 by the Pod's driver 1412 through its direct, passthrough access to the accelerator 1450, which bypasses the drivers 1414 and 1416 of the VM 1406 and hypervisor 1408. Through this driver 1412, the APIs specified in the API library 1432, and the RIC SDK 1430, the application 1402 also receives the results of the operations (e.g., computations) performed by the hardware accelerator 1450.

The application 1402 provides (at 1515) the result of its operation to one or more O-RAN components, such as the E2 unit 1650 that provided the data that started the process 1500 or the SDL storage. This result is provided through the RIC SDK 1430 and the distributed near RT RIC 1680. In other embodiments, the application 1402 (through the RIC SDK 1430) provides the results of its operation to one or more other applications (applications other than the E2 unit that provided the data for which the application performed its operation) operating on another O-RAN E2 unit or machine executing on the same host computer or on another host computer as the application that uses the hardware accelerator 1450 to perform the operation. The process 1500 ends after 1515.
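The three operations of process 1500 can be summarized in the following sketch. All names are hypothetical: the driver class below computes on the CPU purely so that the example runs anywhere, whereas the driver 1412 described above would hand the computation to the hardware accelerator, and the `sum_of_squares` kernel is an invented placeholder for the offloaded computation.

```python
class StandInDriver:
    """Stand-in for the machine's accelerator driver; computes on the CPU here."""

    def run(self, kernel, data):
        if kernel == "sum_of_squares":
            return sum(x * x for x in data)
        raise ValueError(f"unknown kernel: {kernel}")


class AcceleratorSdk:
    """Hypothetical high-level API that maps to low-level driver calls."""

    def __init__(self, driver):
        self._driver = driver

    def signal_energy(self, samples):
        # High-level call converted into a low-level kernel invocation.
        return self._driver.run("sum_of_squares", samples)


def process_1500(received, sdk, publish):
    """Receive data (1505), offload the computation (1510), provide the result (1515)."""
    samples = received["samples"]             # 1505: data from the E2 unit
    result = sdk.signal_energy(samples)       # 1510: computed through the driver
    publish(received["source"], result)       # 1515: result sent to O-RAN components
    return result


delivered = []
outcome = process_1500(
    {"source": "e2-unit-1650", "samples": [1.0, 2.0, 3.0]},
    AcceleratorSdk(StandInDriver()),
    lambda destination, value: delivered.append((destination, value)),
)
```

Passing the publishing step in as a callable keeps the sketch neutral about whether the result goes back to the originating E2 unit, to the SDL storage, or to another application.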

Other embodiments use the passthrough access for the O-RAN control or edge application in other deployment settings. For instance, FIG. 17 illustrates another example of CP or edge applications 1702 that have passthrough access to a hardware accelerator 1750 of their host computer 1710 to perform some (or all) of their computations. In this example, each application 1702 (1) executes on a Pod 1704 that executes on a VM 1706, and (2) uses the accelerator driver 1712 of this VM 1706 which has direct, passthrough access to the accelerator 1750 of its host computer 1710. The VM 1706 executes over a hypervisor 1708 operating on the host computer 1710. The VM's accelerator driver 1712 bypasses the hardware accelerator drivers 1716 of the hypervisor 1708. In some embodiments, the hypervisor 1708 executes over an operating system (not shown) of the host computer 1710. In these embodiments, the direct access of the VM's accelerator driver 1712 to the hardware accelerator 1750 bypasses the hardware accelerator driver of the operating system.

To use the hardware accelerator 1750, each application 1702 in some embodiments uses high-level APIs of the RIC SDK 1730 (executing on its Pod 1704) to communicate with the hardware accelerator 1750. The RIC SDK 1730 converts the high-level APIs to low-level APIs that are needed to communicate with the VM's driver 1712, which, in turn, relays the communication to the hardware accelerator 1750. In some embodiments, the low-level APIs used by the RIC SDK 1730 are APIs specified in an API library 1732 associated with the hardware accelerator 1750. This API library 1732 is part of the driver interface of the VM 1706.

FIG. 18 illustrates yet another example of CP or edge applications 1802 that have passthrough access to a hardware accelerator 1850 of their host computer 1810 to perform some or all of their computations. In this example, each application 1802 (1) executes on a VM 1804 that executes on a hypervisor 1806 operating on the host computer 1810, and (2) uses the accelerator driver 1812 of its VM 1804, which has direct, passthrough access to the accelerator 1850 of its host computer 1810.

The VM's accelerator driver 1812 bypasses the hardware accelerator drivers 1816 of the hypervisor 1806. In some embodiments, the hypervisor 1806 executes over an operating system (not shown) of the host computer 1810. In these embodiments, the direct access of the VM's accelerator driver 1812 to the hardware accelerator 1850 bypasses the hardware accelerator driver of the operating system.

To use the hardware accelerator 1850, each application 1802 in some embodiments uses high-level APIs of the RIC SDK 1830 (executing on its VM 1804) to communicate with the hardware accelerator 1850. The RIC SDK 1830 converts the high-level APIs to low-level APIs that are needed to communicate with the VM's driver 1812, which, in turn, relays the communication to the hardware accelerator 1850. In some embodiments, the low-level APIs used by the RIC SDK 1830 are APIs specified in an API library 1832 associated with the hardware accelerator 1850. This API library 1832 is part of the driver interface of the VM 1804.

One of ordinary skill will realize that the passthrough access for the O-RAN control or edge application is used in other deployment settings in other embodiments. For instance, instead of operating on Pods, the applications in other embodiments operate on containers. These embodiments then use the hardware accelerator drivers of their Pods or VMs to have passthrough access to the hardware accelerators for the control or edge application. In some of these embodiments, the control or edge application communicates with the hardware accelerator through its associated RIC SDK, and communicates with other O-RAN components (to receive data and to provide results of its processing of the data) through its associated RIC SDK and the distributed near RT RIC connecting the O-RAN components and the application. In some embodiments, the control or edge application in these embodiments performs processes similar to process 1500 of FIG. 15.

The above-described direct, passthrough access to hardware accelerators is quite beneficial for O-RANs. The RIC is all about decoupling the intelligence that used to be embedded within the RAN software (CU and DU) and moving it to the cloud. One benefit of this is to use more advanced computing in the cloud for the xApp and edge operations (e.g., for ML, deep learning, reinforcement learning for control algorithms, etc.). A DU close to a cell site typically cannot run advanced computations because it would not be economically feasible to put GPUs at each cell site, as the network CapEx would be very high.

By using the hardware accelerators (GPUs, FPGAs, eASICs, ASICs) in the SDDC, some embodiments run complex control algorithms in the cloud. Examples of such xApps include Massive MIMO beam forming and Multi-user (MU) MIMO user pairing, which were described above. Generally, any xApp whose computations can benefit from massive parallelization would benefit from GPUs or other accelerators. The use of ASICs is beneficial for channel decoding/encoding (turbo encoding, LDPC encoding, etc.). In some embodiments, the RIC is typically on the same worker VM as the xApps. However, in other embodiments, the RIC executes on a different host computer so that more xApps that need GPUs and other hardware accelerators can run on the hosts with the GPUs and/or other hardware accelerators.

FIG. 19 illustrates a process 1900 that some embodiments use to deploy O-RAN applications with direct, passthrough access to the hardware accelerators of their host computers. To install an application on a host computer, the process 1900 selects (at 1905) a set of one or more installation files that includes a description for configuring passthrough access for the application to a hardware accelerator of the host computer. In some embodiments, the set of files includes one description file that specifies direct, passthrough access for the application to the hardware accelerator of its host computer.

The process 1900 uses (at 1910) the set of installation files to configure, based on the description relating to the passthrough access, a program executing on the host computer to pass calls from a particular hardware accelerator driver associated with the application to the hardware accelerator without going through an intervening set of one or more drivers for the hardware accelerator that executes on the host computer between the particular hardware accelerator driver and the hardware accelerator. This configuration allows the application to bypass the intervening set of drivers when directing the hardware accelerator to perform operations for the application and to receive the results of the operations from the hardware accelerator.

The program that is configured at 1910 in some embodiments is the host's operating system, while in other embodiments it is a hypervisor executing on the host computer. In still other embodiments, the program is a virtual machine (VM) and the application operates on a Pod or container that executes on the VM. The process 1900 completes (at 1915) the installation of the application by processing the remaining set of installation files selected at 1905, and then ends. In other embodiments, the process 1900 performs the configuration of the program as its last operation instead of as its first operation at 1910. In still other embodiments, it performs this configuration as one of its intervening installation operations.
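A minimal sketch of process 1900, for the ordering in which the passthrough configuration is performed first, follows. The file records, the `passthrough` key, and the configuration dictionary are invented for illustration and do not reflect the format of any actual installation file or host program.

```python
def install_application(install_files, host_program_config):
    """Sketch of process 1900 with the passthrough configuration done first."""
    # 1905: pick out the file that carries the passthrough description.
    description = next(f for f in install_files if "passthrough" in f)
    # 1910: configure the host program so that calls from the application's
    # driver reach the accelerator without traversing intervening drivers.
    bindings = host_program_config.setdefault("passthrough", {})
    bindings[description["app"]] = description["passthrough"]["accelerator"]
    # 1915: process the remaining installation files.
    return [f["name"] for f in install_files if f is not description]


files = [
    {"name": "image.tar"},
    {"name": "accel.yaml", "app": "xapp-a", "passthrough": {"accelerator": "gpu0"}},
    {"name": "service.yaml"},
]
config = {}
remaining = install_application(files, config)
```

Moving the configuration step after the loop over the remaining files would correspond to the variant in which the program is configured as the last operation.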

Before performing the selection and configuration, the deployment process of some embodiments identifies the host computer from several host computers as the computer on which the application should be installed. The process in some embodiments identifies the host computer by determining that the application requires a hardware accelerator, identifying a set of host computers that each comprise a hardware accelerator, and selecting the host computer from the set of host computers. In some embodiments, the process selects the host computer by (1) determining that the application will need to communicate with a set of one or more other applications that execute on the selected host computer, and (2) selecting the host computer because the set of other applications executes concurrently on that host computer. This installation of the application with the set of other applications on the selected host computer reduces communication delay between the application and the set of other applications.
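The host identification described above can be sketched as a filter followed by a preference for co-location. The host records and the tie-breaking behavior (the first qualifying host wins a tie) are assumptions made for this example.

```python
def select_host(hosts, needs_accelerator, peer_apps):
    """Pick a host with an accelerator, preferring one that runs the peer applications."""
    if needs_accelerator:
        candidates = [h for h in hosts if h["has_accelerator"]]
    else:
        candidates = list(hosts)
    if not candidates:
        return None
    # Prefer the host that already runs the most applications the new one talks to.
    best = max(candidates, key=lambda h: len(set(h["apps"]) & set(peer_apps)))
    return best["name"]


hosts = [
    {"name": "host-a", "has_accelerator": False, "apps": ["xapp-1", "xapp-2"]},
    {"name": "host-b", "has_accelerator": True, "apps": []},
    {"name": "host-c", "has_accelerator": True, "apps": ["xapp-1"]},
]
chosen = select_host(hosts, needs_accelerator=True, peer_apps=["xapp-1", "xapp-2"])
```

Here `host-a` runs both peer applications but is excluded because it has no accelerator, so the accelerator-equipped host with the most peers is chosen.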

Some embodiments have the hardware accelerator drivers of the O-RAN control or edge applications communicate with virtualized hardware accelerators that are offered by an intervening virtualization application (e.g., hypervisor) that executes on the same host computer as the application. For instance, the method of some embodiments deploys a virtualization application on a host computer for sharing resources of the host computer among several machines executing on the host computer. This computer has a first set of one or more physical hardware accelerators.

The method deploys several applications on several machines to perform several O-RAN related operations for a set of O-RAN components. Through the virtualization application, the method defines a second set of two or more virtual hardware accelerators that are mapped to the first set of physical hardware accelerators by the virtualization application. The method assigns different virtual hardware accelerators to different applications. The method also configures the applications to use their assigned virtual hardware accelerators to perform their operations.
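The definition and assignment of virtual hardware accelerators can be sketched as two small mappings. The identifier format and the one-to-one assignment policy are assumptions for illustration; a virtualization application would perform the actual mapping and sharing.

```python
def define_virtual_accelerators(physical_ids, per_physical):
    """Map each virtual accelerator identifier to the physical accelerator backing it."""
    return {
        f"{physical}-v{index}": physical
        for physical in physical_ids
        for index in range(per_physical)
    }


def assign_virtual_accelerators(apps, virtual_to_physical):
    """Give each application its own virtual accelerator."""
    available = sorted(virtual_to_physical)
    if len(apps) > len(available):
        raise ValueError("not enough virtual accelerators for the applications")
    return dict(zip(apps, available))


mapping = define_virtual_accelerators(["gpu0"], per_physical=2)
assignment = assign_virtual_accelerators(["xapp-a", "xapp-b"], mapping)
```

Both applications end up with distinct virtual accelerators that resolve to the same physical device, which mirrors the sharing arrangement described above.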

In some embodiments, the deployed machines are Pods, and the applications are deployed to execute on the Pods. At least two Pods execute on one VM that executes above the virtualization application. This VM includes a hardware accelerator driver that is configured to communicate with two different virtual hardware accelerators for the two applications executing on the two Pods. In other embodiments, multiple Pods execute on one VM that executes above the virtualization application, and each Pod has a hardware accelerator driver that is configured to communicate with a virtual hardware accelerator that is assigned to that driver.

FIG. 20 illustrates an example of CP or edge applications 2002 that have passthrough access to virtual hardware accelerators 2052 and 2054 defined by a hypervisor 2008 executing on their host computer 2010, in order to perform some or all of their computations. As shown, each application 2002 executes on a Pod 2004, which has accelerator drivers 2012 with direct, passthrough access to virtual accelerators 2052 or 2054. Each Pod 2004 operates within (i.e., executes on) a VM 2006, which, in turn, executes over a hypervisor 2008 of the host computer 2010.

Each Pod's accelerator driver 2012 has direct access to the virtual accelerator 2052 or 2054, and this access bypasses the accelerator drivers 2014 and 2016 of the VM 2006 and the hypervisor 2008. In some embodiments, the hypervisor 2008 executes over an operating system (not shown) of the host computer 2010. In these embodiments, the direct access of each Pod's accelerator driver 2012 to the virtual accelerator 2052 or 2054 also bypasses the hardware accelerator driver of the operating system.

As shown, the virtual accelerators 2052 and 2054 communicate with the hardware accelerator 2050 through the accelerator manager 2060 of the hypervisor 2008. The accelerator manager 2060 allows the virtual accelerators 2052 and 2054 (and in turn their associated applications 2002) to share one hardware accelerator 2050, while operating with this accelerator 2050 as if it were dedicated to their respective applications 2002 and Pods 2004. Examples of such a hardware accelerator 2050 include a graphical processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), and a structured ASIC.

To communicate with its virtual accelerator 2052 or 2054, each application 2002 in some embodiments communicates through the RIC SDK 2030 executing on its Pod 2004. For instance, in some embodiments, each application 2002 uses high-level APIs of the RIC SDK 2030 to communicate with its virtual accelerator 2052 or 2054. The RIC SDK 2030 then converts the high-level APIs to low-level APIs that are needed to communicate with each machine's driver 2012, which, in turn, relays the communication to the virtual accelerator 2052 or 2054. The virtual accelerator 2052 or 2054 then relays the communications to the hardware accelerator 2050 through the accelerator manager 2060.

As mentioned above by reference to FIG. 14, in some embodiments, the low-level APIs are provided by a first company associated with the sale of the hardware accelerator 2050, while the RIC SDK 2030 is provided by a second company associated with the distribution of the RIC SDK 2030. In some embodiments, the low-level APIs used by the RIC SDK 2030 are APIs specified in an API library 2032 associated with the hardware accelerator 2050. Each application 2002 receives the results of the operations of the hardware accelerator 2050 through the accelerator manager 2060, its virtual accelerator 2052 or 2054, its driver 2012, and its RIC SDK 2030.
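
The request and result path described above (application, RIC SDK, Pod driver, virtual accelerator, accelerator manager, hardware accelerator, and back) can be sketched as a chain of simple objects. Every class, method, and operation name below is hypothetical, and the "sum" operation merely stands in for whatever computation a real accelerator would perform; the sketch shows only the layering and is not an actual SDK, driver, or hypervisor interface.

```python
class HardwareAccelerator:
    def run(self, op, data):
        # Stand-in computation; a real accelerator would run a kernel here.
        if op == "sum":
            return sum(data)
        raise ValueError(f"unsupported op: {op}")


class AcceleratorManager:
    """Lets several virtual accelerators share one hardware accelerator."""

    def __init__(self, hardware):
        self.hardware = hardware
        self.log = []  # records which virtual accelerator submitted work

    def submit(self, vacc_id, op, data):
        self.log.append(vacc_id)
        return self.hardware.run(op, data)


class VirtualAccelerator:
    def __init__(self, vacc_id, manager):
        self.vacc_id = vacc_id
        self.manager = manager

    def execute(self, op, data):
        # Relay the request to the shared hardware through the manager.
        return self.manager.submit(self.vacc_id, op, data)


class PodDriver:
    """Driver in the Pod with passthrough access to a virtual accelerator."""

    def __init__(self, vacc):
        self.vacc = vacc

    def low_level_call(self, op, data):
        return self.vacc.execute(op, data)


class RicSdk:
    """Converts a high-level API call into a low-level driver call."""

    def __init__(self, driver):
        self.driver = driver

    def total_load(self, samples):
        return self.driver.low_level_call("sum", list(samples))


manager = AcceleratorManager(HardwareAccelerator())
sdk_a = RicSdk(PodDriver(VirtualAccelerator("2052", manager)))
sdk_b = RicSdk(PodDriver(VirtualAccelerator("2054", manager)))
print(sdk_a.total_load([1, 2, 3]))  # prints 6
print(sdk_b.total_load([10, 20]))   # prints 30
```

Each application calls only its own SDK object, yet both requests are served by the single hardware accelerator object, and each result returns along the same chain in reverse.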

FIG. 21 conceptually illustrates an electronic system 2100 with which some embodiments of the invention are implemented. The electronic system 2100 may be a computer (e.g., a desktop computer, personal computer, tablet computer, server computer, mainframe, a blade computer etc.), or any other sort of electronic device. Such an electronic system 2100 includes various types of computer-readable media and interfaces for various other types of computer-readable media. Electronic system 2100 includes a bus 2105, processing unit(s) 2110, a system memory 2125, a read-only memory 2130, a permanent storage device 2135, input devices 2140, and output devices 2145.

The bus 2105 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 2100. For instance, the bus 2105 communicatively connects the processing unit(s) 2110 with the read-only memory 2130, the system memory 2125, and the permanent storage device 2135.

From these various memory units, the processing unit(s) 2110 retrieve instructions to execute and data to process in order to execute the processes of the invention. The processing unit(s) 2110 may be a single processor or a multi-core processor in different embodiments.

The read-only-memory (ROM) 2130 stores static data and instructions that are needed by the processing unit(s) 2110 and other modules of the electronic system 2100. The permanent storage device 2135, on the other hand, is a read-and-write memory device. This device 2135 is a non-volatile memory unit that stores instructions and data even when the electronic system 2100 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 2135.

Other embodiments use a removable storage device (such as a floppy disk, flash drive, etc.) as the permanent storage device 2135. Like the permanent storage device 2135, the system memory 2125 is a read-and-write memory device. However, unlike storage device 2135, the system memory 2125 is a volatile read-and-write memory, such as random-access memory. The system memory 2125 stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 2125, the permanent storage device 2135, and/or the read-only memory 2130. From these various memory units, the processing unit(s) 2110 retrieve instructions to execute and data to process in order to execute the processes of some embodiments.

The bus 2105 also connects to the input and output devices 2140 and 2145. The input devices 2140 enable the user to communicate information and select commands to the electronic system 2100. The input devices 2140 include alphanumeric keyboards and pointing devices (also called “cursor control devices”). The output devices 2145 display images generated by the electronic system 2100. The output devices 2145 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD). Some embodiments include devices such as a touchscreen that function as both input and output devices.

Finally, as shown in FIG. 21, bus 2105 also couples electronic system 2100 to a network 2165 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet), or a network of networks, such as the Internet. Any or all components of electronic system 2100 may be used in conjunction with the invention.

Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessors or multi-core processors that execute software, some embodiments are performed by one or more integrated circuits, such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself.

As used in this specification, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms “display” and “displaying” mean displaying on an electronic device. As used in this specification, the terms “computer-readable medium,” “computer-readable media,” and “machine-readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For instance, a number of the figures conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, the process could be implemented using several sub-processes, or as part of a larger macro process.

Also, several embodiments described above only show one hardware accelerator per host computer. However, one of ordinary skill will realize that the methodology and architecture of some embodiments can be used to provide direct, passthrough access to multiple hardware accelerators on one host computer. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims. 

We claim:
 1. A method of deploying an application in an O-RAN (Open Radio Access Network), the method comprising: selecting a set of installation files for installing the application on a host computer, the set of installation files comprising a description for configuring passthrough access for the application to a hardware accelerator of the host computer; using the set of installation files to install the application on the host computer, said using comprising configuring, based on the description, a program executing on the host computer to pass calls from a particular hardware accelerator driver associated with the application to the hardware accelerator without going through an intervening set of one or more drivers for the hardware accelerator that executes on the host computer between the particular hardware accelerator driver and the hardware accelerator; said configuring allowing the application to bypass the intervening set of drivers when directing the hardware accelerator to perform operations for the application and to receive the results of the operations from the hardware accelerator.
 2. The method of claim 1, wherein the application performs operations offloaded by an O-RAN E2 unit.
 3. The method of claim 1, wherein the application executes over the program on the host computer.
 4. The method of claim 3, wherein the program is an operating system of the host computer.
 5. The method of claim 3, wherein the program is a hypervisor executing on the host computer.
 6. The method of claim 3, wherein the program is a virtual machine (VM) and the application operates on a Pod that executes on the VM.
 7. The method of claim 3, wherein the program is a virtual machine (VM) and the application operates on a container that executes on the VM.
 8. The method of claim 1 further comprising: prior to performing the selecting and configuring, identifying the host computer from a plurality of host computers as the computer on which the application should be installed, said identifying comprising: determining that the application requires a hardware accelerator; identifying a set of host computers that each comprise a hardware accelerator; selecting the host computer from the set of host computers.
 9. The method of claim 8, wherein selecting the host computer comprises determining that the application will need to communicate with a set of one or more other applications that execute on the selected host computer; and selecting the host computer as the set of other applications already executes on the host computer, said installation of the application and the set of other applications on the selected host computer reducing communication delay between the application and the set of other applications.
 10. The method of claim 8, wherein the application is installed on the host computer in the set of host computers in order to install the applications that benefit from hardware acceleration on the set of host computers that offer the hardware acceleration.
 11. The method of claim 1, wherein the hardware accelerator is one of a graphical processing unit (GPU), field-programmable gate array (FPGA), application-specific integrated circuit (ASIC), and structured ASIC.
 12. The method of claim 1, wherein the application is a control plane application of the O-RAN.
 13. The method of claim 1, wherein the application is an edge application of the O-RAN.
 14. A non-transitory machine readable medium storing a program which when executed by at least one processing unit deploys an application in an O-RAN (Open Radio Access Network), the program comprising sets of instructions for: selecting a set of installation files for installing the application on a host computer, the set of installation files comprising a description for configuring passthrough access for the application to a hardware accelerator of the host computer; using the set of installation files to install the application on the host computer, said using comprising configuring, based on the description, a program executing on the host computer to pass calls from a particular hardware accelerator driver associated with the application to the hardware accelerator without going through an intervening set of one or more drivers for the hardware accelerator that executes on the host computer between the particular hardware accelerator driver and the hardware accelerator; said configuring allowing the application to bypass the intervening set of drivers when directing the hardware accelerator to perform operations for the application and to receive the results of the operations from the hardware accelerator.
 15. The non-transitory machine readable medium of claim 14, wherein the application performs operations offloaded by an O-RAN E2 unit.
 16. The non-transitory machine readable medium of claim 14, wherein the application executes over the program on the host computer.
 17. The non-transitory machine readable medium of claim 16, wherein the program is an operating system of the host computer.
 18. The non-transitory machine readable medium of claim 16, wherein the program is a hypervisor executing on the host computer.
 19. The non-transitory machine readable medium of claim 16, wherein the program is a virtual machine (VM) and the application operates on a Pod that executes on the VM.
 20. The non-transitory machine readable medium of claim 16, wherein the program is a virtual machine (VM) and the application operates on a container that executes on the VM. 